Twig Pattern Matching Algorithms for XML
Authors
Abstract
The emergence of XML promised significant advances in B2B integration, because users can store or transmit structured data using this highly flexible open standard. A well-formed XML document structure helps convert data into useful information that can be processed quickly and efficiently. This creates a need for efficient query processing over XML data in XML databases, both XML-enabled (Microsoft SQL Server, Oracle) and native XML (MarkLogic Server, EMC xDB). An active research area in XML databases is the efficient processing of XML tree pattern queries (TPQs), also called twigs. Just as parsers in general construct a parse tree for some representation, an XML DOM parser converts an XML document into an XML tree. XML query languages such as XQL (XML Query Language), XML-QL (a query language for XML), Quilt, XPath (XML Path Language), and XQuery (the W3C XML query language) represent queries on XML data as twigs (small tree patterns). The core operation of XML query processing is to find all occurrences of a twig pattern in an XML database efficiently. In the past few years, many algorithms have been proposed to match such tree patterns. This paper presents an overview of the state of the art in TPQ processing. It first provides background on holistic approaches to TPQ processing and then introduces different algorithms for twig pattern matching.
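To make the notion of a twig match concrete, the following is a minimal sketch of naive twig pattern matching over a DOM-style tree. It is not one of the holistic algorithms surveyed in the paper (such as TwigStack), and it does none of their filtering or stack-based optimization. The names `TwigNode`, `candidates`, `match_at`, and `match_twig`, as well as the sample document and query, are assumptions introduced only for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from itertools import product


@dataclass
class TwigNode:
    """One node of a twig (hypothetical representation, not from the paper)."""
    tag: str
    # Each edge is (axis, child): "/" = parent-child, "//" = ancestor-descendant.
    edges: list = field(default_factory=list)


def candidates(elem, axis):
    """Elements reachable from elem along the given axis."""
    if axis == "/":
        return list(elem)
    # elem.iter() yields elem itself first; skip it to get proper descendants.
    return list(elem.iter())[1:]


def match_at(elem, q):
    """All matches of twig q rooted at elem, each a list of elements
    in pre-order of the query nodes."""
    if elem.tag != q.tag:
        return []
    per_edge = []
    for axis, child_q in q.edges:
        sub = [m for c in candidates(elem, axis) for m in match_at(c, child_q)]
        if not sub:
            return []  # one query branch cannot be satisfied here
        per_edge.append(sub)
    # Combine one sub-match per query branch (a leaf yields just [elem]).
    return [[elem] + [e for part in combo for e in part]
            for combo in product(*per_edge)]


def match_twig(root, q):
    """Try every element of the document as the image of the twig root."""
    return [m for elem in root.iter() for m in match_at(elem, q)]


DOC = """<bib>
  <book><title>XML</title><author><name>Ann</name></author></book>
  <book><title>SQL</title></book>
  <book><title>Twigs</title><author><name>Bo</name></author><author><name>Cy</name></author></book>
</bib>"""

# Twig for the XPath-like pattern book[title]//name
QUERY = TwigNode("book", [("/", TwigNode("title")), ("//", TwigNode("name"))])

matches = match_twig(ET.fromstring(DOC), QUERY)
result = [tuple(e.text for e in m[1:]) for m in matches]
print(result)
```

This brute-force approach re-scans subtrees for every candidate root, which is the kind of redundant work that holistic twig join algorithms are designed to avoid.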
Similar resources
Prefix Path Streaming: a New Clustering Method for XML Twig Pattern Matching
Searching for all occurrences of a twig pattern in an XML document is an important operation in XML query processing. Recently a class of holistic twig pattern matching algorithms has been proposed. Compared with the prior approaches, the holistic method avoids generating large intermediate results which do not contribute to the final answer. The method is CPU and I/O optimal when twig patterns ...
A Hybrid Approach for General XML Query Processing
The state-of-the-art XML twig pattern query processing algorithms focus on matching a single twig pattern to a document. However, many practical queries are modeled by multiple twig patterns with joins to link them. The output of twig pattern matching is tuples of labels, while the joins between twig patterns are based on values. The inefficiency of integrating label-based structural joins in t...
The Space Complexity of Processing XML Twig Queries Over Indexed Documents (Irwin and Joan Jacobs Center for Communication and Information Technologies)
Current twig join algorithms incur high memory costs on queries that involve child-axis nodes. In this paper we provide an analytical explanation for this phenomenon. In a first large-scale study of the space complexity of evaluating XPath queries over indexed XML documents we show the space to depend on three factors: (1) whether the query is a path or a tree; (2) the types of axes occurring i...
Twig Pattern Matching: A Revisit
Twig pattern matching plays a crucial role in XML query processing. In order to reduce the processing time, some existing holistic one-phase twig pattern matching algorithms (e.g., HolisticTwigStack [3], TwigFast [5]) use the core function getNext of TwigStack [2] to effectively and efficiently filter out the useless elements. However, using getNext as a filter may incur other redundant com...
Fast Matching of Twig Patterns
Twig pattern matching plays a crucial role in XML data processing. Existing twig pattern matching algorithms can be classified into two-phase algorithms and one-phase algorithms. While the two-phase algorithms (e.g., TwigStack) suffer from expensive merging cost, the one-phase algorithms (e.g., TwigList, Twig2Stack, HolisticTwigStack) either lack efficient filtering of useless elements, or use o...